AITopics | min null 2

Collaborating Authors

min null 2

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Trajectory Alignment: Understanding the Edge of Stability Phenomenon via Bifurcation Theory

Neural Information Processing Systems, Feb-17-2026, 14:56:02 GMT

Cohen et al. (2021) empirically study the evolution of the largest eigenvalue of the loss Hessian, also known as sharpness, along the gradient descent (GD) trajectory and observe the Edge of Stability (EoS) phenomenon.
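As a rough illustration of the quantity named above, the following sketch estimates sharpness (the largest Hessian eigenvalue) by power iteration and runs gradient descent on a toy quadratic loss. The quadratic, the matrix `H`, and the helper names `sharpness` and `run_gd` are assumptions made for this example and are not taken from the paper. On a quadratic the sharpness is constant, so this only shows the classical stability threshold of 2 / (step size) that the Edge of Stability discussion refers to, not the phenomenon itself.

```python
import numpy as np

def sharpness(hessian, iters=200, seed=0):
    """Estimate the largest eigenvalue of a symmetric PSD matrix by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(hessian.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = hessian @ v
        v = w / np.linalg.norm(w)
    # Rayleigh quotient of the converged unit vector
    return float(v @ hessian @ v)

def run_gd(hessian, x0, lr, steps=50):
    """Gradient descent on L(x) = 0.5 * x^T H x; returns the loss before each step."""
    x = np.array(x0, dtype=float)
    losses = []
    for _ in range(steps):
        losses.append(0.5 * x @ hessian @ x)
        x = x - lr * (hessian @ x)  # gradient of the quadratic is H x
    return losses

# Hypothetical toy Hessian with eigenvalues 4 and 1, so the threshold is 2 / 4 = 0.5.
H = np.diag([4.0, 1.0])
lam = sharpness(H)
stable = run_gd(H, [1.0, 1.0], lr=0.4)    # step size below 2 / sharpness
unstable = run_gd(H, [1.0, 1.0], lr=0.6)  # step size above 2 / sharpness
```

With the smaller step size the loss shrinks over the run, while with the larger one it grows along the sharpest direction.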

artificial intelligence, machine learning, trajectory, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Reading (0.04)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

fb8fe6b79288f3d83696a5d276f4fc9d-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-13-2026, 01:35:16 GMT

assumption 1, inequality, lemma, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Sample-Efficient Tabular Self-Play for Offline Robust Reinforcement Learning

Li, Na, Zheng, Zewu, Ni, Wei, Shan, Hangguan, Zhang, Wenjie, Li, Xinyu

arXiv.org Machine Learning, Dec-2-2025

Multi-agent reinforcement learning (MARL), as a thriving field, explores how multiple agents independently make decisions in a shared dynamic environment. Due to environmental uncertainties, policies in MARL must remain robust to tackle the sim-to-real gap. We focus on robust two-player zero-sum Markov games (TZMGs) in offline settings, specifically on tabular robust TZMGs (RTZMGs). We propose a model-based algorithm (RTZ-VI-LCB) for offline RTZMGs, which is optimistic robust value iteration combined with a data-driven Bernstein-style penalty term for robust value estimation. By accounting for distribution shifts in the historical dataset, the proposed algorithm establishes near-optimal sample complexity guarantees under partial coverage and environmental uncertainty. An information-theoretic lower bound is developed to confirm the tightness of our algorithm's sample complexity, which is optimal regarding both state and action spaces. To the best of our knowledge, RTZ-VI-LCB is the first to attain this optimality, sets a new benchmark for offline RTZMGs, and is validated experimentally.

algorithm, probability, rtzmg, (16 more...)

arXiv.org Machine Learning

2512.00352

Country:

North America > United States (0.14)
Oceania > Australia > New South Wales (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Government (0.46)
Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Trajectory Alignment: Understanding the Edge of Stability Phenomenon via Bifurcation Theory

Neural Information Processing Systems, Oct-9-2025, 09:54:50 GMT

artificial intelligence, machine learning, trajectory, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Reading (0.04)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

An Online Riemannian PCA for Stochastic Canonical Correlation Analysis: Supplementary Material

Neural Information Processing Systems, Aug-15-2025, 05:50:38 GMT

Nonlinear mean shift over Riemannian manifolds.

exp 1, manifold, stochastic canonical correlation analysis, (12 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin > Dane County > Madison (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Aligned Multi Objective Optimization

Efroni, Yonathan, Kretzu, Ben, Jiang, Daniel, Bhandari, Jalaj, Zhu, Zheqing, Ullrich, Karen

arXiv.org Artificial Intelligence, Mar-3-2025

To date, the multi-objective optimization literature has mainly focused on conflicting objectives, studying the Pareto front, or requiring users to balance tradeoffs. Yet, in machine learning practice, there are many scenarios where such conflict does not take place. Recent findings from multi-task learning, reinforcement learning, and LLM training show that diverse related tasks can enhance performance across objectives simultaneously. Despite this evidence, this phenomenon has not been examined from an optimization perspective. This leads to a lack of generic gradient-based methods that can scale to scenarios with a large number of related objectives. To address this gap, we introduce the Aligned Multi-Objective Optimization framework, propose new algorithms for this setting, and provide theoretical guarantees of their superior performance compared to naive approaches.

curvature, min null 2, objective, (13 more...)

arXiv.org Artificial Intelligence

2502.14096

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation

Wang, Yudan, Wang, Yue, Zhou, Yi, Zou, Shaofeng

arXiv.org Machine Learning, Jun-3-2024

Actor-critic (AC) is a powerful method for learning an optimal policy in reinforcement learning, where the critic uses algorithms, e.g., temporal difference (TD) learning with function approximation, to evaluate the current policy and the actor updates the policy along an approximate gradient direction using information from the critic. This paper provides the tightest non-asymptotic convergence bounds for both the AC and natural AC (NAC) algorithms. Specifically, existing studies show that AC converges to an $\epsilon+\varepsilon_{\text{critic}}$ neighborhood of stationary points with the best known sample complexity of $\mathcal{O}(\epsilon^{-2})$ (up to a log factor), and NAC converges to an $\epsilon+\varepsilon_{\text{critic}}+\sqrt{\varepsilon_{\text{actor}}}$ neighborhood of the global optimum with the best known sample complexity of $\mathcal{O}(\epsilon^{-3})$, where $\varepsilon_{\text{critic}}$ is the approximation error of the critic and $\varepsilon_{\text{actor}}$ is the approximation error induced by the insufficient expressive power of the parameterized policy class. This paper analyzes the convergence of both AC and NAC algorithms with compatible function approximation. Our analysis eliminates the term $\varepsilon_{\text{critic}}$ from the error bounds while still achieving the best known sample complexities. Moreover, we focus on the challenging single-loop setting with a single Markovian sample trajectory. Our major technical novelty lies in analyzing the stochastic bias due to policy-dependent and time-varying compatible function approximation in the critic, and handling the non-ergodicity of the MDP due to the single Markovian sample trajectory. Numerical results are also provided in the appendix.
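For readers unfamiliar with compatible function approximation, the sketch below shows the standard idea in its simplest form: the critic's features are the score function of the policy, that is, the gradient of log pi(a|s) with respect to the policy parameters. The tabular softmax policy, the array shapes, and the helper names `softmax` and `compatible_features` are assumptions made for this example and are not the paper's algorithm. The final lines check the well-known property that these features average to zero under the policy.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - logits.max()
    p = np.exp(z)
    return p / p.sum()

def compatible_features(theta, s, a):
    """Score function grad_theta log pi_theta(a|s) for a tabular softmax policy.

    theta has shape (num_states, num_actions); only row s is nonzero in the result,
    where it equals one_hot(a) - pi(.|s).
    """
    probs = softmax(theta[s])
    psi = np.zeros_like(theta)
    psi[s] = -probs
    psi[s, a] += 1.0
    return psi

# Hypothetical policy with 3 states and 4 actions.
rng = np.random.default_rng(0)
theta = rng.standard_normal((3, 4))
s = 1
probs = softmax(theta[s])

# Expected feature vector under the policy at state s.
mean_psi = sum(probs[a] * compatible_features(theta, s, a) for a in range(4))
```

A linear critic built on these features, with weight array `w` of the same shape as `theta`, would score a pair (s, a) as `np.sum(w * compatible_features(theta, s, a))`; because the features change whenever `theta` changes, the critic's function class is policy-dependent and time-varying, which is the difficulty the abstract highlights.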

equation, function approximation, non-asymptotic analysis, (12 more...)

arXiv.org Machine Learning

2406.01762

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Utah (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
